


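With the arrival of the big-data era, deep learning models have achieved state-of-the-art results on tasks such as image classification and text classification. Their success, however, depends heavily on large amounts of training data. In real-world scenarios, some classes have only a few samples or only a few labeled samples, and labeling unlabeled data costs considerable time and manual effort. Humans, by contrast, can learn quickly from only a few examples. The concept of few-shot learning [2,3] was proposed to bring machine learning closer to the way humans think.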


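2018, ULMFiT (fine-tuned language model): the model has three stages: (1) language model pre-training; (2) language model fine-tuning; (3) classifier fine-tuning. Its main innovation is varying the learning rate while fine-tuning the language model. [1] Howard J, Ruder S. Universal language model fine-tuning for text classification. arXiv preprint arXiv:1801.06146, 2018.
2019, another fine-tuning model, built mainly on the following mechanisms: (1) use a lower learning rate when retraining on the few-shot classes; (2) use an adaptive gradient optimizer during the fine-tuning stage; (3) when the source and target datasets differ substantially, adapt by adjusting the entire network. [2] Nakamura A, Harada T. Revisiting fine-tuning for few-shot learning. arXiv preprint arXiv:1910.00216, 2019.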
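
As one concrete instance of varying the learning rate during fine-tuning, the ULMFiT paper describes a slanted triangular schedule: the rate rises linearly over a short warm-up and then decays linearly. The sketch below is a minimal pure-Python version. The function name `slanted_triangular_lr` is hypothetical, and the defaults (`lr_max=0.01`, `cut_frac=0.1`, `ratio=32`) are assumed from the settings reported in that paper, not taken from these notes.

```python
import math

def slanted_triangular_lr(t, total_steps, lr_max=0.01, cut_frac=0.1, ratio=32):
    """Learning rate at step t (0-based) out of total_steps.

    Assumes 0 < cut_frac < 1. `ratio` is how many times smaller the
    lowest rate is than lr_max.
    """
    # Step at which the schedule switches from increasing to decreasing.
    cut = max(1, math.floor(total_steps * cut_frac))
    if t < cut:
        p = t / cut  # linear warm-up: 0 -> 1
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))  # linear decay: 1 -> 0
    p = max(0.0, p)  # guard against tiny negative values from rounding
    return lr_max * (1 + p * (ratio - 1)) / ratio
```

With `total_steps=1000`, the rate starts at `lr_max / ratio`, peaks at `lr_max` at step 100, and falls back toward `lr_max / ratio` by the last step. A lower `lr_max` would correspond to the "lower learning rate on few-shot classes" idea in item (1) of the 2019 method.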


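The fundamental problem in few-shot learning is that there are too few samples, which lowers sample diversity. When data is limited, data augmentation can be used to increase sample diversity. This article divides data-augmentation-based methods into three kinds: methods based on unlabeled data, methods based on data synthesis, and methods based on feature augmentation.
2.1 Methods based on unlabeled data: use unlabeled data to expand the few-shot dataset.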
2016, Wang et al.: Wang YX, Hebert M. Learning from small sample sets by combining unsupervised meta-training with CNNs. In: Advances in Neural Information Processing Systems. 2016. 244−252.
2018, improved MAML + semi-supervised learning: Boney R, Ilin A. Semi-supervised few-shot learning with MAML. In: Proc. of the ICLR (Workshop). 2018.
2018, improved prototypical network + unlabeled data: Ren MY, Triantafillou E, Ravi S, et al. Meta-learning for semi-supervised few-shot classification. arXiv preprint arXiv:1803.00676, 2018.
2019, transductive propagation network: Liu Y, Lee J, Park M, et al. Learning to propagate labels: Transductive propagation network for few-shot learning. arXiv preprint arXiv:1805.10002, 2018.
2019, cross attention network: Hou RB, Chang H, Ma BP, et al. Cross attention network for few-shot classification. In: Advances in Neural Information Processing Systems. 2019. 4003−4014.
2.2 Methods based on data synthesis
Generative adversarial network (GAN): Mehrotra A, Dukkipati A. Generative adversarial residual pairwise networks for one shot learning. arXiv preprint arXiv:1703.08033, 2017.
Representation learning + few-shot learning: learn a general representation model on a source dataset with abundant data, then fine-tune the model on novel classes with few samples. Hariharan B, Girshick R. Low-shot visual recognition by shrinking and hallucinating features. In: Proc. of the IEEE Int’l Conf. on Computer Vision. 2017. 3018−3027.
Meta-learning + data generation: a data generation model produces virtual data to increase sample diversity and, combined with meta-learning, the generation model and the classification algorithm are trained jointly end to end. Wang YX, Girshick R, Hebert M, et al. Low-shot learning from imaginary data. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2018. 7278−7286.
Variational autoencoder (VAE) + GAN: combines the strengths of both in a new network, f-VAEGAN-D2. Xian Y, Sharma S, Schiele B, et al. f-VAEGAN-D2: A feature generating framework for any-shot learning. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2019. 10275−10284.
Meta-learning: use meta-learning to interpolate support-set images with images from the training set, forming an augmented support set. Chen Z, Fu Y, Kim YX, et al. Image deformation meta-networks for one-shot learning. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2019. 8680−8689.
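
The interpolation idea in the last entry can be illustrated with a deliberately simplified sketch: each support image is blended pixel by pixel with images drawn from the training set, and the blends are added to the support set. This is not the cited network, where the blending weights are learned by a meta-learner. Here the weight is a fixed scalar, and the names `blend`, `augment_support`, and `gallery_imgs` are hypothetical.

```python
def blend(support_img, gallery_img, weight=0.7):
    """Pixel-wise convex combination of two equally sized grayscale images.

    Images are lists of rows of floats. `weight` is the share kept from
    the support image (a fixed value here, learned in the real method).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [
        [weight * s + (1.0 - weight) * g for s, g in zip(s_row, g_row)]
        for s_row, g_row in zip(support_img, gallery_img)
    ]

def augment_support(support_img, gallery_imgs, weight=0.7):
    """Return the original support image plus one blended copy per gallery image."""
    return [support_img] + [blend(support_img, g, weight) for g in gallery_imgs]
```

All blended copies keep the label of the support image they were built from, so a one-shot class ends up with `1 + len(gallery_imgs)` training examples.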
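2.3 Methods based on feature augmentation
2017, AGA model: learns a mapping for synthesizing data so that a sample's attribute takes a desired value or strength. Dixit M, Kwitt R, Niethammer M, et al. AGA: Attribute guided augmentation. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2017. 7455−7463.
Feature transfer network (FATTEN): describes the changes in motion trajectory caused by changes in object pose. Liu B, Wang X, Dixit M, et al. Feature space transfer for data augmentation. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2018. 9090−9098.
Delta-encoder: synthesizes new samples for an unseen class from only a few examples and uses the synthesized samples to train a classifier. The model extracts transferable intra-class deformations (deltas) between same-class training samples and applies those deltas to the few samples of a novel class, so that new-class samples can be synthesized effectively. Schwartz E, Karlinsky L, Shtok J, et al. Delta-encoder: An effective sample synthesis method for few-shot object recognition. In: Advances in Neural Information Processing Systems. 2018. 2845−2855.
Dual network TriNet: each image class has richer features in the semantic space, so image features can be augmented by mapping in both directions between the label semantic space and the image feature space. Chen Z, Fu Y, Zhang Y, et al. Semantic feature augmentation in few-shot learning. arXiv preprint arXiv:1804.05298, 2018.
Adversarial features: proposes replacing the fixed attention mechanism with an uncertain attention mechanism M. Features extracted from the input image are average-pooled and classified to obtain the cross-entropy loss l. Taking the gradient of l with respect to M gives the update direction that maximizes l, and M is updated along that direction. Shen W, Shi Z, Sun J. Learning from adversarial features for few-shot classification. arXiv preprint arXiv:1903.10225, 2019.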
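
The update of M in the adversarial-features entry can be written as a gradient-ascent step. The form below is a sketch under assumptions not stated in these notes: M is applied to the extracted feature map F as an element-wise mask before pooling, y is the true label, cls is the classifier, and ε is a hypothetical step size.

```latex
l(M) = \mathrm{CE}\big(\mathrm{cls}(\mathrm{AvgPool}(M \odot F)),\; y\big),
\qquad
M \leftarrow M + \epsilon \, \nabla_{M}\, l(M)
```

The plus sign reflects that M moves in the direction that increases l, the opposite of the usual descent step used for the network weights.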
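1) Make better use of unlabeled data: the real world contains large amounts of unlabeled data, and ignoring it discards much information. Using unlabeled data better and more sensibly is a very important direction for improvement.
2) Make better use of auxiliary features: in few-shot learning, the small sample size lowers feature diversity. To increase it, auxiliary datasets or auxiliary attributes can be used for feature augmentation, helping the model extract better features and improve classification accuracy.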


3.1 Metric learning
Siamese neural network: learns a metric from data and then uses the learned metric to compare and match samples of unknown classes; the two twin networks share one set of parameters and weights. Koch G, Zemel R, Salakhutdinov R. Siamese neural networks for one-shot image recognition. In: Proc. of the ICML Deep Learning Workshop. 2015.
Matching network: Vinyals O, Blundell C, Lillicrap T, et al. Matching networks for one shot learning. In: Advances in Neural Information Processing Systems. 2016. 3630−3638.
LSTM + matching network: Jiang LB, Zhou XL, Jiang FW, Che L. One-shot learning based on improved matching network. Systems Engineering and Electronics, 2019,41(6):1210−1217.
Multi-attention network model: Wang P, Liu L, Shen C, et al. Multi-attention network for one shot learning. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2017. 2721−2729.
Prototypical network: Snell J, Swersky K, Zemel RS. Prototypical networks for few-shot learning. In: Advances in Neural Information Processing Systems. 2017. 4077−4087.
Hybrid attention-based prototypical network: Gao TY, Han X, Liu ZY, Sun MS. Hybrid attention-based prototypical networks for noisy few-shot relation classification. In: Proc. of the AAAI Conf. on Artificial Intelligence. 2019. 6407−6414.
Hierarchical attention prototypical network (HAPN): Sun SL, Sun QF, Zhou K, Lv TC. Hierarchical attention prototypical networks for few-shot text classification. In: Proc. of the Conf. on Empirical Methods in Natural Language Processing and the 9th Int’l Joint Conf. on Natural Language Processing (EMNLP-IJCNLP). 2019. 476−485.
Relation network: Sung F, Yang Y, Zhang L, et al. Learning to compare: Relation network for few-shot learning. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2018. 1199−1208.
Deep comparison network: Zhang X, Sung F, Qiang Y, et al. Deep comparison: Relation columns for few-shot learning. arXiv preprint arXiv:1811.07100, 2018.
Hilliard N, Phillips L, Howland S, et al. Few-shot learning with metric-agnostic conditional embeddings. arXiv preprint arXiv:1802.04376, 2018.
Covariance metric network (CovaMNet): Li W, Xu J, Huo J, Wang L, Yang G, Luo J. Distribution consistency based covariance metric networks for few-shot learning. In: Proc. of the AAAI Conf. on Artificial Intelligence. 2019. 8642−8649.
Deep nearest neighbor neural network (DN4): Li W, Wang L, Xu J, Huo J, Gao Y, Luo J. Revisiting local descriptor based image-to-class measure for few-shot learning. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2019. 7260−7268.
Li H, Eigen D, Dodge S, et al. Finding task-relevant features for few-shot learning by category traversal. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2019. 1−10.
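
The weight-sharing idea behind the Siamese network listed first in this subsection can be shown with a toy sketch: both inputs pass through the same embedding function, and a query is assigned the label of the closest support example. Everything here is an assumption for illustration: the weights in `SHARED_W` are fixed by hand rather than learned, the embedding is a single linear map, and Euclidean distance stands in for whatever learned similarity a real model would use.

```python
import math

# Hypothetical shared weights: one 2x3 linear map used by BOTH branches.
SHARED_W = [
    [1.0, 0.0, -1.0],
    [0.5, 1.0, 0.0],
]

def embed(x, weights=SHARED_W):
    """Map a 3-dimensional input to a 2-dimensional embedding."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def distance(a, b):
    """Euclidean distance between the embeddings of a and b (same weights for both)."""
    ea, eb = embed(a), embed(b)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(ea, eb)))

def match(query, support):
    """One-shot matching: `support` maps each label to a single example."""
    return min(support, key=lambda label: distance(query, support[label]))
```

Because `embed` is the only place the weights appear, the two branches cannot drift apart, which is what "sharing one set of parameters" means in practice.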

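3.2 Meta-learning-based methods

The goal of meta-learning is to give a model the ability to learn, an ability that lets it acquire meta-knowledge automatically. Meta-knowledge is knowledge that can be learned outside the model's training process, such as the model's hyperparameters, the initial parameters of a neural network, the network architecture, and the optimizer.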
Neural Turing machine: Graves A, Wayne G, Danihelka I. Neural turing machines. arXiv preprint arXiv:1410.5401, 2014.
Memory-augmented neural network (MANN): Santoro A, Bartunov S, Botvinick M, et al. One-shot learning with memory-augmented neural networks. arXiv preprint arXiv:1605.06065, 2016.
Meta network: Munkhdalai T, Yu H. Meta networks. In: Proc. of the Int’l Conf. on Machine Learning (PMLR). 2017. 2554−2563.
Model-agnostic meta-learning (MAML): Finn C, Abbeel P, Levine S. Model-agnostic meta-learning for fast adaptation of deep networks. In: Proc. of the 34th Int’l Conf. on Machine Learning, Vol.70. 2017. 1126−1135.
Task-agnostic meta-learning (TAML): Jamal MA, Qi GJ. Task agnostic meta-learning for few-shot learning. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2019. 11719−11727.
Attention-based task-agnostic meta-learning (ATAML): Xiang J, Havaei M, Chartrand G, et al. On the importance of attention in meta-learning for few-shot text classification. arXiv preprint arXiv:1806.00852, 2018.
MAML improvement: Sun Q, Liu Y, Chua TS, et al. Meta-transfer learning for few-shot learning. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2019. 403−412.
MAML improvement: Liu Y, Sun Q, Liu AA, et al. LCC: Learning to customize and combine neural networks for few-shot learning. arXiv preprint arXiv:1904.08479, 2019.
Task-aware feature embedding network (TAFE-Net): Wang X, Yu F, Wang R, et al. TAFE-Net: Task-aware feature embeddings for low shot learning. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2019. 1831−1840.
Optimizer-based meta-learning model: Ravi S, Larochelle H. Optimization as a model for few-shot learning. In: Proc. of the ICLR. 2016.
Attention-based weight generator: Gidaris S, Komodakis N. Dynamic few-shot visual learning without forgetting. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2018. 4367−4375.
Meta-learning with multi-task clustering: Yu M, Guo X, Yi J, et al. Diverse few-shot text classification with multiple metrics. In: Proc. of the NAACL-HLT. 2018. 1206−1215.

3.3 Graph-neural-network-based methods

GNN: Garcia V, Bruna J. Few-shot learning with graph neural networks. In: Proc. of the Int’l Conf. on Learning Representations. 2018.
EGNN: Kim J, Kim T, Kim S, et al. Edge-labeling graph neural network for few-shot learning. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2019. 11−20.
GNN + DAE: Gidaris S, Komodakis N. Generating classification weights with GNN denoising autoencoders for few-shot learning. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2019. 21−30.


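(1) Omniglot contains 1623 handwritten characters from 50 alphabets; each character was drawn online by 20 different people via Amazon's Mechanical Turk.
(2) miniImageNet is split off from ImageNet as a reduced version of it, containing 100 ImageNet classes with 600 images each. Typically 64 classes are used for training, 16 for validation, and 20 for testing.
(3) tieredImageNet is a dataset proposed by Ren et al. in 2018 and is also a subset of ImageNet. Unlike miniImageNet, tieredImageNet has more classes: 608.
(4) CUB (Caltech-UCSD Birds) is a bird image dataset containing 200 bird species and 11788 images in total. Typically 130 classes are used for training, 20 for validation, and 50 for testing.
(5) CIFAR-100: 100 classes in total, each with 600 images, split into 500 training images and 100 test images. The 100 subclasses belong to 20 superclasses, and every image carries one subclass label and one superclass label.
(6) Stanford Dogs: generally used for fine-grained image classification. It covers 120 dog classes with 20580 images in total; typically 70 classes are used for training, 20 for validation, and 30 for testing.
(7) Stanford Cars: generally used for fine-grained image classification. It covers 196 car classes with 16185 images in total; typically 130 classes are used for training, 17 for validation, and 49 for testing.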
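
The train/validation/test counts quoted above partition the classes themselves, not the images: no class appears in more than one split. A minimal sketch of such a class-level split follows. The function name `split_classes` and the random shuffling are assumptions for illustration; published benchmarks normally ship a fixed class list per split, which should be preferred when comparing results.

```python
import random

def split_classes(class_ids, n_train, n_val, n_test, seed=0):
    """Partition class ids into disjoint train/val/test class sets."""
    class_ids = list(class_ids)
    if n_train + n_val + n_test != len(class_ids):
        raise ValueError("split sizes must sum to the number of classes")
    # Shuffle a copy with a local generator so the split is reproducible.
    shuffled = class_ids[:]
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Example with the miniImageNet class counts (64 / 16 / 20 out of 100).
train_cls, val_cls, test_cls = split_classes(range(100), 64, 16, 20)
```

The same call with `(200, 130, 20, 50)`, `(120, 70, 20, 30)`, or `(196, 130, 17, 49)` reproduces the class counts listed for CUB, Stanford Dogs, and Stanford Cars.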








